In total we have data from 2303 participants. Of those, 2298 participants either finished the experiment or failed at the captcha stage (i.e., they did not abandon the study partway through). Of those, 2278 solved two or more captchas and proceeded to the roulette part. All following analyses are based on these 2278 participants.
The distribution of conditions is as follows (prop = proportion):
## # A tibble: 3 × 3
## expt_cond n prop
## <fct> <int> <dbl>
## 1 No Message 755 0.331
## 2 Banner 796 0.349
## 3 Popup & Banner 727 0.319
Participants with and without experience in online roulette:
## # A tibble: 2 × 3
## roulette n prop
## <fct> <int> <dbl>
## 1 Yes 1786 0.784
## 2 No 492 0.216
Some demographic information:
## # A tibble: 5 × 3
## gender n prop
## <chr> <int> <dbl>
## 1 Female 965 0.424
## 2 Male 1289 0.566
## 3 Non-binary 15 0.00658
## 4 None 5 0.00219
## 5 Prefer Not to Disclose 4 0.00176
## vars n mean sd median trimmed mad min max range skew kurtosis se
## age 1 2276 35.94 10.98 34 34.99 10.38 18 87.0 69.0 0.82 0.47 0.23
## bonus 2 2278 4.91 3.45 5 4.79 1.04 0 79.4 79.4 7.56 123.73 0.07
## bet_count 3 2278 5.97 12.94 3 3.39 4.45 0 198.0 198.0 7.00 70.84 0.27
Some conditional demographic information:
## # A tibble: 6 × 4
## gender expt_cond n prop
## <chr> <fct> <int> <dbl>
## 1 Female No Message 318 0.140
## 2 Female Banner 329 0.144
## 3 Female Popup & Banner 318 0.140
## 4 Male No Message 427 0.187
## 5 Male Banner 462 0.203
## 6 Male Popup & Banner 400 0.176
##
## Descriptive statistics by group
## expt_cond: No Message
## vars n mean sd median trimmed mad min max range skew kurtosis se
## expt_cond* 1 755 1.00 0.00 1 1.00 0.00 1 1.0 0.0 NaN NaN 0.00
## age 2 754 35.82 10.71 34 34.82 9.64 18 81.0 63.0 0.92 0.82 0.39
## bonus 3 755 4.98 4.18 5 4.79 1.48 0 79.4 79.4 9.52 147.49 0.15
## bet_count 4 755 6.22 13.10 3 3.57 4.45 0 187.0 187.0 6.64 65.17 0.48
## ----------------------------------------------------------------------------------
## expt_cond: Banner
## vars n mean sd median trimmed mad min max range skew kurtosis se
## expt_cond* 1 796 2.00 0.00 2 2.00 0.00 2 2.0 0.0 NaN NaN 0.00
## age 2 795 36.20 11.12 34 35.28 10.38 19 87.0 68.0 0.79 0.31 0.39
## bonus 3 796 4.95 3.43 5 4.81 0.74 0 46.6 46.6 5.20 50.70 0.12
## bet_count 4 796 5.57 10.36 3 3.33 4.45 0 104.0 104.0 4.67 28.74 0.37
## ----------------------------------------------------------------------------------
## expt_cond: Popup & Banner
## vars n mean sd median trimmed mad min max range skew kurtosis se
## expt_cond* 1 727 3.00 0.00 3 3.00 0.00 3 3 0 NaN NaN 0.00
## age 2 727 35.79 11.10 34 34.85 10.38 18 78 60 0.77 0.29 0.41
## bonus 3 727 4.80 2.52 5 4.78 0.89 0 20 20 1.28 7.20 0.09
## bet_count 4 727 6.14 15.15 3 3.30 4.45 0 198 198 7.56 72.68 0.56
Now let’s take a look at the Problem Gambling Severity Index (PGSI) scores:
## # A tibble: 1 × 2
## pgsi_mean pgsi_sd
## <dbl> <dbl>
## 1 2.49 3.76
## # A tibble: 3 × 3
## expt_cond pgsi_mean pgsi_sd
## <fct> <dbl> <dbl>
## 1 No Message 2.51 3.90
## 2 Banner 2.45 3.76
## 3 Popup & Banner 2.52 3.61
Of the total of 2278 participants, 579 did not bet at all. We can take a look at the bonus as a function of whether or not participants bet:
## # A tibble: 2 × 7
## bet_at_all mean median sd n min max
## <lgl> <dbl> <dbl> <dbl> <int> <dbl> <dbl>
## 1 FALSE 5 5 0 579 5 5
## 2 TRUE 4.89 5 4.00 1699 0 79.4
## # A tibble: 6 × 8
## # Groups: expt_cond [3]
## expt_cond bet_at_all mean median sd n min max
## <fct> <lgl> <dbl> <dbl> <dbl> <int> <dbl> <dbl>
## 1 No Message FALSE 5 5 0 186 5 5
## 2 No Message TRUE 4.98 5 4.81 569 0 79.4
## 3 Banner FALSE 5 5 0 212 5 5
## 4 Banner TRUE 4.94 5 4.00 584 0 46.6
## 5 Popup & Banner FALSE 5 5 0 181 5 5
## 6 Popup & Banner TRUE 4.74 4.9 2.90 546 0 20
Note that the expected loss when betting in our roulette task is 1/37 of the amount staked, so participants are expected to retain 36/37 of what they bet. For the initial £5 staked once, this gives:
\[£5 \times \frac{36}{37} \approx £4.86\]
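The 1/37 figure can be illustrated with a short calculation. This is only a sketch: it assumes a single-zero wheel with 37 pockets and the standard payout odds for three common bet types, none of which are spelled out in this report.

```r
# Expected net return per £1 staked on a single-zero wheel.
# Assumption: 37 pockets and the standard payout odds listed below.
bets <- data.frame(
  bet       = c("straight up", "red/black", "dozen"),
  n_winning = c(1, 18, 12),   # number of winning pockets
  payout    = c(35, 1, 2)     # net winnings per £1 staked when the bet wins
)
p_win <- bets$n_winning / 37
bets$expected_return <- p_win * bets$payout - (1 - p_win)

print(bets)                   # every expected_return equals -1/37

# Expected amount retained if the initial £5 is staked once
retained <- 5 * (1 + bets$expected_return[1])
round(retained, 2)
```

Under these assumptions every bet type has the same expected loss per pound staked, which is why a single retention figure applies regardless of how participants bet.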
Next let’s take a look at our main DV, proportion of money bet, which is defined as follows: \[\texttt{prop_bet} = \frac{\texttt{amount}}{5 + \texttt{total_win}}\]
## # A tibble: 3 × 3
## expt_cond prop_bet_mean prop_bet_sd
## <fct> <dbl> <dbl>
## 1 No Message 0.344 0.322
## 2 Banner 0.323 0.321
## 3 Popup & Banner 0.316 0.319
Our DV clearly does not look normally distributed.
## # A tibble: 1 × 3
## gamble_at_all gamble_everything proportion_bet_rest
## <dbl> <dbl> <dbl>
## 1 0.746 0.122 0.361
## # A tibble: 1 × 3
## no_gamble gamble_at_all gamble_everything
## <int> <int> <int>
## 1 579 1699 208
Binomial confidence or credible intervals for the probability of gambling at all:
## method x n mean lower upper
## 1 agresti-coull 1699 2278 0.7458297 0.7275419 0.7632898
## 2 asymptotic 1699 2278 0.7458297 0.7279503 0.7637091
## 3 bayes 1699 2278 0.7457218 0.7277920 0.7635307
## 4 cloglog 1699 2278 0.7458297 0.7274299 0.7631969
## 5 exact 1699 2278 0.7458297 0.7274217 0.7636032
## 6 logit 1699 2278 0.7458297 0.7275397 0.7632913
## 7 probit 1699 2278 0.7458297 0.7276259 0.7633743
## 8 profile 1699 2278 0.7458297 0.7276804 0.7634263
## 9 lrt 1699 2278 0.7458297 0.7276681 0.7634233
## 10 prop.test 1699 2278 0.7458297 0.7273225 0.7634990
## 11 wilson 1699 2278 0.7458297 0.7275467 0.7632850
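Three of the intervals in the table above can be reproduced with base R alone. The sketch below uses the counts from the table (1699 of 2278); the variable names are illustrative, and it is not necessarily how the table itself was produced.

```r
x <- 1699   # participants who placed at least one bet
n <- 2278   # participants in the analysis
p_hat <- x / n

# Wald ("asymptotic") interval: estimate plus/minus z times the standard error
wald <- p_hat + c(-1, 1) * qnorm(0.975) * sqrt(p_hat * (1 - p_hat) / n)

# Clopper-Pearson ("exact") interval
exact <- as.numeric(binom.test(x, n)$conf.int)

# Wilson score interval (prop.test without continuity correction)
wilson <- as.numeric(prop.test(x, n, correct = FALSE)$conf.int)

round(rbind(wald, exact, wilson), 4)
```

With a sample of this size all methods agree to about three decimal places, so the choice of interval has no practical consequence here.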
Distribution per condition:
We use a custom parameterization of a zero-one-inflated beta-regression model (see also here). The likelihood of the model is given by:
\[\begin{align} f(y) &= (1 - g) & & \text{if } y = 0 \\ f(y) &= g \times e & & \text{if } y = 1 \\ f(y) &= g \times (1 - e) \times \text{Beta}(a,b) & & \text{if } y \notin \{0, 1\} \\ a &= \mu \times \phi \\ b &= (1-\mu) \times \phi \end{align}\]
Here, \(1 - g\) is the zero-inflation probability, so \(g\) (zipp) is the probability to gamble at all; \(e\) is the conditional one-inflation probability (coi), that is, the conditional probability to gamble everything (i.e., to have a value of one, given that one gambles); \(\mu\) is the mean of the beta distribution (Intercept); and \(\phi\) is the precision of the beta distribution (phi). As we use Stan for modelling, all parameters are modelled on the real line and need appropriate link functions. For \(\phi\) the link is log (inverse is exp()); for all other parameters it is logit (inverse is plogis()).
We fit this model and add experimental condition as a factor to the three main model parameters (i.e., only the precision parameter is fixed across conditions). The following table gives an overview of the model and all model parameters and shows good convergence.
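To make the likelihood concrete, here is a minimal base R sketch of the density. The function name dzoib_sketch is hypothetical and this is not the custom family code used for fitting; the parameter values are the rounded intercepts from the model summary reported below, back-transformed with the inverse links.

```r
# Density/mass function for the zero-one-inflated beta likelihood.
# g = probability to gamble at all, e = conditional probability to gamble
# everything, mu and phi = mean and precision of the beta component.
dzoib_sketch <- function(y, mu, phi, g, e) {
  a <- mu * phi
  b <- (1 - mu) * phi
  out <- numeric(length(y))
  is_zero <- y == 0
  is_one  <- y == 1
  is_mid  <- !is_zero & !is_one
  out[is_zero] <- 1 - g                                   # point mass at 0
  out[is_one]  <- g * e                                   # point mass at 1
  out[is_mid]  <- g * (1 - e) * dbeta(y[is_mid], a, b)    # continuous part
  out
}

# Illustrative values on the natural scale (inverse links applied)
mu  <- plogis(-0.45)
phi <- exp(1.18)
g   <- plogis(1.12)
e   <- plogis(-1.99)

# The two point masses plus the continuous part should add up to one
continuous <- integrate(dzoib_sketch, 0, 1, mu = mu, phi = phi, g = g, e = e)$value
total <- (1 - g) + g * e + continuous
total
```

The check at the end confirms that the three pieces of the likelihood form a proper probability distribution on the closed interval from 0 to 1.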
## Family: zoib2
## Links: mu = logit; phi = log; zipp = logit; coi = logit
## Formula: prop_bet ~ 0 + Intercept + expt_cond
## phi ~ 1
## zipp ~ 0 + Intercept + expt_cond
## coi ~ 0 + Intercept + expt_cond
## Data: duse (Number of observations: 2278)
## Draws: 4 chains, each with iter = 26000; warmup = 1000; thin = 1;
## total post-warmup draws = 1e+05
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## phi_Intercept 1.18 0.03 1.11 1.24 1.00 108609 75461
## Intercept -0.45 0.04 -0.53 -0.36 1.00 70546 71228
## expt_condBanner -0.10 0.06 -0.22 0.02 1.00 78123 76045
## expt_condPopup&Banner -0.11 0.06 -0.23 0.01 1.00 79997 76023
## zipp_Intercept 1.12 0.08 0.96 1.29 1.00 72578 70218
## zipp_expt_condBanner -0.11 0.12 -0.33 0.12 1.00 80464 74831
## zipp_expt_condPopup&Banner -0.01 0.12 -0.25 0.22 1.00 82014 76326
## coi_Intercept -1.99 0.13 -2.25 -1.74 1.00 70881 68608
## coi_expt_condBanner 0.07 0.18 -0.29 0.42 1.00 78531 71854
## coi_expt_condPopup&Banner -0.04 0.19 -0.41 0.32 1.00 80965 75928
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
As a visual convergence check, we plot the density and trace plots for the four intercept parameters representing the no message (control) condition or the overall mean (for phi).
The model does not have any obvious problems, even without priors for the condition specific effects.
As expected the synthetic data generated from the model looks a lot like the actual data. This suggests that the model is adequate for the data.
Our hypothesis is about the proportion bet, \(Pr_{bet}\), which is given by:
\[Pr_{bet} = (g \times e) + (g \times (1-e) \times \mu)\]
The following tables summarise the resulting \(Pr_{bet}\) posterior distributions across conditions and their differences from the no message condition.
## # A tibble: 3 × 7
## expt_cond prop_bet .lower .upper .width .point .interval
## <fct> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
## 1 No Message 0.349 0.327 0.373 0.95 mean qi
## 2 Banner 0.328 0.306 0.351 0.95 mean qi
## 3 Popup & Banner 0.329 0.306 0.352 0.95 mean qi
## # A tibble: 2 × 7
## expt_cond prop_bet .lower .upper .width .point .interval
## <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
## 1 Banner - No Message -0.0212 -0.0534 0.0111 0.95 mean qi
## 2 Popup & Banner - No Message -0.0203 -0.0526 0.0123 0.95 mean qi
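As a rough plausibility check, the \(Pr_{bet}\) formula can be applied to the rounded posterior means from the model summary. This is only an approximation: the tables above are computed per posterior draw, whereas the sketch plugs point estimates into a nonlinear formula, and it assumes the coefficients are offsets from the No Message intercept.

```r
conds <- c("No Message", "Banner", "Popup & Banner")

# Linear predictors per condition: intercept plus condition offset,
# using the rounded estimates from the model summary
eta_mu   <- -0.45 + c(0, -0.10, -0.11)
eta_zipp <-  1.12 + c(0, -0.11, -0.01)
eta_coi  <- -1.99 + c(0,  0.07, -0.04)

# All three parameters use a logit link, so plogis() is the inverse
mu <- plogis(eta_mu)     # mean of the beta component
g  <- plogis(eta_zipp)   # probability to gamble at all
e  <- plogis(eta_coi)    # conditional probability to gamble everything

pr_bet <- setNames(g * e + g * (1 - e) * mu, conds)
round(pr_bet, 3)
```

The plug-in values land very close to the posterior means in the table, which suggests the formula and the link functions are being read correctly.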
Mu:
## expt_cond response lower.HPD upper.HPD
## No Message 0.390 0.369 0.410
## Banner 0.366 0.347 0.386
## Popup & Banner 0.364 0.344 0.384
##
## Point estimate displayed: median
## Results are back-transformed from the logit scale
## HPD interval probability: 0.95
g:
## expt_cond response lower.HPD upper.HPD
## No Message 0.754 0.723 0.784
## Banner 0.734 0.703 0.764
## Popup & Banner 0.751 0.719 0.781
##
## Point estimate displayed: median
## Results are back-transformed from the logit scale
## HPD interval probability: 0.95
e:
## expt_cond response lower.HPD upper.HPD
## No Message 0.121 0.0942 0.148
## Banner 0.128 0.1017 0.156
## Popup & Banner 0.117 0.0906 0.144
##
## Point estimate displayed: median
## Results are back-transformed from the logit scale
## HPD interval probability: 0.95
## Family: zoib2
## Links: mu = logit; phi = log; zipp = logit; coi = logit
## Formula: prop_bet ~ expt_cond + pgsi_c
## phi ~ 1
## zipp ~ expt_cond + pgsi_c
## coi ~ expt_cond + pgsi_c
## Data: duse (Number of observations: 2278)
## Draws: 4 chains, each with iter = 26000; warmup = 1000; thin = 1;
## total post-warmup draws = 1e+05
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.45 0.04 -0.54 -0.37 1.00 147289 81390
## phi_Intercept 1.18 0.03 1.12 1.25 1.00 192152 79733
## zipp_Intercept 1.14 0.08 0.97 1.30 1.00 137836 81842
## coi_Intercept -2.04 0.13 -2.30 -1.79 1.00 149622 78426
## expt_condBanner -0.10 0.06 -0.22 0.02 1.00 144832 85026
## expt_condPopup&Banner -0.11 0.06 -0.23 0.01 1.00 147100 84862
## pgsi_c 0.02 0.01 0.01 0.03 1.00 243969 74720
## zipp_expt_condBanner -0.10 0.12 -0.33 0.13 1.00 137959 84597
## zipp_expt_condPopup&Banner -0.02 0.12 -0.26 0.22 1.00 133246 87221
## zipp_pgsi_c 0.07 0.02 0.04 0.10 1.00 195951 77443
## coi_expt_condBanner 0.06 0.18 -0.29 0.42 1.00 150039 85441
## coi_expt_condPopup&Banner -0.04 0.19 -0.41 0.32 1.00 148403 86039
## coi_pgsi_c 0.09 0.02 0.06 0.12 1.00 187549 80180
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
As a visual convergence check, we plot the density and trace plots for the four intercept parameters representing the no message condition or the overall mean (for phi).
Let’s then take a look at the difference distribution of proportion bet after adjusting for PGSI:
## # A tibble: 2 × 7
## expt_cond prop_bet .lower .upper .width .point .interval
## <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
## 1 Banner - No Message -0.0210 -0.0527 0.0110 0.95 mean qi
## 2 Popup & Banner - No Message -0.0205 -0.0527 0.0120 0.95 mean qi
## Family: zoib2
## Links: mu = logit; phi = log; zipp = logit; coi = logit
## Formula: prop_bet ~ expt_cond * pgsi_c
## phi ~ 1
## zipp ~ expt_cond * pgsi_c
## coi ~ expt_cond * pgsi_c
## Data: duse (Number of observations: 2278)
## Draws: 4 chains, each with iter = 26000; warmup = 1000; thin = 1;
## total post-warmup draws = 1e+05
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.45 0.04 -0.54 -0.37 1.00 118011 78112
## phi_Intercept 1.19 0.03 1.12 1.25 1.00 131837 77887
## zipp_Intercept 1.13 0.09 0.96 1.30 1.00 107774 76676
## coi_Intercept -2.02 0.13 -2.29 -1.77 1.00 110396 74054
## expt_condBanner -0.10 0.06 -0.22 0.02 1.00 114287 78534
## expt_condPopup&Banner -0.11 0.06 -0.23 0.01 1.00 116914 81953
## pgsi_c -0.00 0.01 -0.03 0.02 1.00 87041 73534
## expt_condBanner:pgsi_c 0.02 0.02 -0.01 0.05 1.00 96283 79737
## expt_condPopup&Banner:pgsi_c 0.05 0.02 0.02 0.08 1.00 96405 78454
## zipp_expt_condBanner -0.08 0.12 -0.31 0.15 1.00 110693 79469
## zipp_expt_condPopup&Banner 0.00 0.12 -0.24 0.24 1.00 111341 82515
## zipp_pgsi_c 0.04 0.02 -0.01 0.09 1.00 86850 75623
## zipp_expt_condBanner:pgsi_c 0.05 0.04 -0.03 0.12 1.00 94416 79556
## zipp_expt_condPopup&Banner:pgsi_c 0.05 0.04 -0.03 0.12 1.00 96439 79162
## coi_expt_condBanner 0.03 0.19 -0.33 0.40 1.00 113047 82707
## coi_expt_condPopup&Banner -0.09 0.19 -0.47 0.29 1.00 111429 80151
## coi_pgsi_c 0.06 0.03 0.01 0.12 1.00 84501 69538
## coi_expt_condBanner:pgsi_c 0.03 0.04 -0.05 0.10 1.00 91648 76296
## coi_expt_condPopup&Banner:pgsi_c 0.04 0.04 -0.04 0.12 1.00 94486 80426
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
As a visual convergence check, we plot the density and trace plots for the four intercept parameters representing the no message condition or the overall mean (for phi).
Let’s then take a look at the difference distribution of proportion bet after adjusting for PGSI:
## # A tibble: 2 × 7
## expt_cond prop_bet .lower .upper .width .point .interval
## <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
## 1 Banner - No Message -0.0209 -0.0531 0.0112 0.95 mean qi
## 2 Popup & Banner - No Message -0.0210 -0.0537 0.0116 0.95 mean qi
Let’s begin with some simple descriptive statistics of the clicks on the GamCare page.
## # A tibble: 3 × 5
## expt_cond proportion sd success n
## <fct> <dbl> <dbl> <int> <int>
## 1 No Message 0.0278 0.165 21 755
## 2 Banner 0.0289 0.168 23 796
## 3 Popup & Banner 0.0248 0.155 18 727
Model shows no obvious convergence problems:
## Family: bernoulli
## Links: mu = logit
## Formula: gamcare_click ~ expt_cond
## Data: duse (Number of observations: 2278)
## Draws: 4 chains, each with iter = 26000; warmup = 1000; thin = 1;
## total post-warmup draws = 1e+05
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -3.57 0.22 -4.03 -3.16 1.00 62390 57127
## expt_condBanner 0.04 0.31 -0.56 0.65 1.00 67558 66366
## expt_condPopup&Banner -0.12 0.33 -0.77 0.52 1.00 65300 67457
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Let’s take a look at the predicted probabilities and their differences from the no message condition:
## expt_cond response lower.HPD upper.HPD
## No Message 0.0276 0.0167 0.0399
## Banner 0.0287 0.0181 0.0411
## Popup & Banner 0.0245 0.0143 0.0365
##
## Point estimate displayed: median
## Results are back-transformed from the logit scale
## HPD interval probability: 0.95
## # A tibble: 2 × 7
## expt_cond prob .lower .upper .width .point .interval
## <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
## 1 Banner - No Message 0.00110 -0.0155 0.0178 0.95 mean qi
## 2 Popup & Banner - No Message -0.00304 -0.0193 0.0133 0.95 mean qi
Now the results figure:
There are a total of 13590 bets. If we remove the first bet of each participant, 11891 bets remain. Of those, 29 (0.24%) bets took longer than 120 seconds. Following our pre-registration, we remove these betting times from the analysis.
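The counts in this paragraph can be reconciled with the rest of the report in a few lines. The sketch assumes that each of the 1699 participants who bet contributes exactly one first bet; the variable names are illustrative.

```r
n_bets_total  <- 13590   # all bets
n_bettors     <- 1699    # participants with at least one bet
n_after_first <- n_bets_total - n_bettors      # bets left after dropping first bets
n_too_slow    <- 29      # bets that took longer than 120 seconds
n_analysed    <- n_after_first - n_too_slow    # betting times entering the model
pct_too_slow  <- 100 * n_too_slow / n_after_first

c(after_first = n_after_first,
  analysed    = n_analysed,
  pct_slow    = round(pct_too_slow, 2))
```

The resulting 11862 betting times match the number of observations reported in the model summary below.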
The following histogram shows the distribution of betting times.
We can also take a look at some descriptive statistics of the distribution:
## # A tibble: 1 × 3
## time_mean time_median time_sd
## <dbl> <dbl> <dbl>
## 1 9.69 5.85 11.2
## # A tibble: 3 × 4
## expt_cond time_mean time_median time_sd
## <fct> <dbl> <dbl> <dbl>
## 1 No Message 9.30 5.90 9.87
## 2 Banner 9.27 5.39 11.5
## 3 Popup & Banner 10.5 6.29 12.1
We analyse the betting times shown above using a shifted-lognormal model with by-participant random intercepts for the log-mean, allowing both log-mean and log-SD to vary across message conditions. The following shows the model summary (which shows no obvious convergence problems).
## Family: shifted_lognormal
## Links: mu = identity; sigma = log; ndt = identity
## Formula: time ~ expt_cond + (1 | ppt_id)
## sigma ~ expt_cond
## Data: times_use2 (Number of observations: 11862)
## Draws: 10 chains, each with iter = 2001; warmup = 334; thin = 3;
## total post-warmup draws = 5557
##
## Group-Level Effects:
## ~ppt_id (Number of levels: 1418)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.68 0.02 0.65 0.72 1.00 8005 11560
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.89 0.04 1.82 1.96 1.00 6868 11053
## sigma_Intercept -0.24 0.01 -0.26 -0.21 1.00 16715 16012
## expt_condBanner -0.05 0.05 -0.16 0.05 1.00 7741 11117
## expt_condPopup&Banner 0.03 0.05 -0.07 0.13 1.00 6400 10288
## sigma_expt_condBanner 0.05 0.02 0.02 0.09 1.00 16442 15747
## sigma_expt_condPopup&Banner 0.03 0.02 -0.00 0.06 1.00 16317 15688
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## ndt 0.64 0.01 0.61 0.66 1.00 16631 15956
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
The summary table shows that one of the condition-specific parameters for the log-SD provides evidence for a difference between the no message condition and the banner condition (the 95% CI for sigma_expt_condBanner does not include 0).
The model shows no obvious convergence problems:
The model is also able to adequately reproduce the shape of the observed data.
Our hypothesis is about the mean betting time, which we need to calculate from the model parameters log-mean m and log-SD sigma as mean = exp(m + sigma^2/2).
The following table shows the predicted mean betting times, which are similar to the observed ones and reproduce the ordering of the condition means.
## # A tibble: 3 × 7
## expt_cond mean .lower .upper .width .point .interval
## <fct> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
## 1 No Message 9.02 8.38 9.70 0.95 mean qi
## 2 Banner 8.87 8.23 9.55 0.95 mean qi
## 3 Popup & Banner 9.48 8.77 10.2 0.95 mean qi
We can also take a look at the differences from the no message condition:
## # A tibble: 2 × 7
## expt_cond mean .lower .upper .width .point .interval
## <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
## 1 Banner - No Message -0.156 -1.09 0.783 0.95 mean qi
## 2 Popup & Banner - No Message 0.454 -0.529 1.42 0.95 mean qi
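The mean formula exp(m + sigma^2/2) can be checked against the rounded posterior means in the model summary. The sketch applies the formula exactly as stated above, so it does not add the shift parameter ndt, and it plugs in point estimates rather than working per posterior draw; it also assumes the condition coefficients are offsets from the No Message intercept.

```r
conds <- c("No Message", "Banner", "Popup & Banner")

# Log-mean per condition (identity link): intercept plus condition offset
m <- 1.89 + c(0, -0.05, 0.03)

# Log-SD per condition: sigma is modelled on the log scale, so exponentiate
sigma <- exp(-0.24 + c(0, 0.05, 0.03))

# Mean of a lognormal distribution with parameters m and sigma
mean_time <- setNames(exp(m + sigma^2 / 2), conds)
round(mean_time, 2)
```

The plug-in values agree with the predicted means in the table to within a few hundredths of a second, with the small gaps due to rounding of the inputs.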
Now the results figure:
The following histogram shows the distribution of the number of spins.
Some descriptive statistics for the non-zero bet counts:
## # A tibble: 3 × 4
## expt_cond bet_count_mean bet_count_median bet_count_sd
## <fct> <dbl> <dbl> <dbl>
## 1 No Message 8.25 4 14.5
## 2 Banner 7.59 4 11.4
## 3 Popup & Banner 8.18 4 17.0
Following the preregistration, we analyse the distribution after excluding all observations with 0 spins. We then describe the data with a negative binomial model truncated below at 1 (i.e., excluding zeros).
This model shows no obvious convergence problems.
## Family: negbinomial
## Links: mu = log; shape = identity
## Formula: bet_count | trunc(lb = 1) ~ expt_cond
## Data: part_nozero (Number of observations: 1699)
## Draws: 4 chains, each with iter = 26000; warmup = 1000; thin = 1;
## total post-warmup draws = 1e+05
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.43 0.11 1.19 1.63 1.00 49824 48954
## expt_condBanner -0.11 0.10 -0.30 0.09 1.00 58818 60915
## expt_condPopup&Banner -0.01 0.10 -0.21 0.19 1.00 60616 59596
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## shape 0.25 0.03 0.18 0.31 1.00 48230 46684
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
The data seems to be well described by the model.
When we zoom in (i.e., ignore data points above 50 for the plot), we can see that the real and synthetic data match quite well.
Then let’s take a look at the predicted number of spins. To do so it is important to note that the mean parameter gives the mean of the non-truncated negative binomial distribution and not the mean of the truncated distribution.
However, we can derive the mean of the truncated distribution from first principles. More specifically, for any random variable \(X\) truncated such that \(X > y\), with density function \(f(x)\) and cumulative distribution function \(F(x)\) of the non-truncated distribution, the expectation (or mean) \(E(X|X > y)\) is given by (see, e.g., Wikipedia) \[ E(X|X > y) = \frac{\int_y^\infty x f(x) dx}{1 - F(y)}. \] In words, the expectation of the truncated random variable is the contribution of the retained region \(X > y\) to the expectation of the non-truncated variable (the numerator), divided by the probability of that region (the denominator).
The difficulty in calculating this expectation is of course the integral in the numerator. However, because the negative binomial distribution is a discrete probability distribution and we truncate it such that \(X > 0\) this calculation is trivial in the present case.
Note that in the case of a discrete variable, the expectation of the truncated variable becomes \[ E(X|X > y) = \frac{\sum_{x=y + 1}^\infty x\, f(x)}{1 - F(y)}. \] Given this formulation it is easy to see that the expectation of the full (i.e., non-truncated) negative binomial distribution, \(E(X) = \sum_{x=0}^\infty x\, f(x)\), is equal to the numerator of the truncated expectation if \(y = 0\). The reason is that the \(x = 0\) term of the sum in \(E(X)\) contributes nothing, so dropping it leaves the sum unchanged. Hence, for the negative binomial truncated at zero, the expectation is given by \[ E(X|X > 0) = \frac{E(X)}{1 - F(0)}. \] Using this formula, we can now calculate the predicted (or estimated) mean number of spins:
## # A tibble: 3 × 7
## expt_cond trunc_mean .lower .upper .width .point .interval
## <fct> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
## 1 No Message 8.28 7.44 9.23 0.95 mean qi
## 2 Banner 7.62 6.87 8.47 0.95 mean qi
## 3 Popup & Banner 8.20 7.36 9.17 0.95 mean qi
## # A tibble: 2 × 7
## expt_cond trunc_mean .lower .upper .width .point .interval
## <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
## 1 Banner - No Message -0.661 -1.87 0.539 0.95 mean qi
## 2 Popup & Banner - No Message -0.0736 -1.34 1.20 0.95 mean qi
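The closed-form result \(E(X|X > 0) = E(X) / (1 - F(0))\) can be verified numerically for the No Message condition. The sketch uses the rounded posterior means from the model summary (Intercept 1.43 on the log scale, shape 0.25) and assumes that the mean and shape parameters correspond to the mu and size arguments of R's negative binomial functions.

```r
mu    <- exp(1.43)   # mean of the non-truncated distribution (log link)
shape <- 0.25        # shape (dispersion) parameter

# Closed form: divide the untruncated mean by the probability of X > 0
p_zero     <- dnbinom(0, size = shape, mu = mu)   # F(0) = P(X = 0)
trunc_mean <- mu / (1 - p_zero)

# Brute-force check: sum x * f(x) over x >= 1 and renormalise
x     <- 1:20000
brute <- sum(x * dnbinom(x, size = shape, mu = mu)) /
  (1 - pnbinom(0, size = shape, mu = mu))

c(closed_form = trunc_mean, brute_force = brute)
```

Both routes give the same value, which is close to the posterior mean reported for the No Message condition; the remaining gap reflects the rounded inputs and the use of point estimates instead of posterior draws.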
Now the results figure:
## R version 4.1.3 (2022-03-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8
## [4] LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
## [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] binom_1.1-1 emmeans_1.7.3 tidybayes_3.0.2 brms_2.16.3 Rcpp_1.0.8.3 forcats_0.5.1
## [7] stringr_1.4.0 dplyr_1.0.8 purrr_0.3.4 readr_2.1.2 tidyr_1.2.0 tibble_3.1.6
## [13] ggplot2_3.3.5 tidyverse_1.3.1 checkpoint_1.0.2
##
## loaded via a namespace (and not attached):
## [1] readxl_1.4.0 backports_1.4.1 plyr_1.8.7 igraph_1.3.0 svUnit_1.0.6
## [6] crosstalk_1.2.0 rstantools_2.2.0 inline_0.3.19 digest_0.6.29 htmltools_0.5.2
## [11] fansi_1.0.3 magrittr_2.0.3 checkmate_2.0.0 tzdb_0.3.0 modelr_0.1.8
## [16] RcppParallel_5.1.5 matrixStats_0.61.0 vroom_1.5.7 xts_0.12.1 prettyunits_1.1.1
## [21] colorspace_2.0-3 rvest_1.0.2 ggdist_3.1.1 haven_2.4.3 xfun_0.30
## [26] callr_3.7.0 crayon_1.5.1 jsonlite_1.8.0 zoo_1.8-9 glue_1.6.2
## [31] gtable_0.3.0 distributional_0.3.0 pkgbuild_1.3.1 rstan_2.21.3 abind_1.4-5
## [36] scales_1.1.1 mvtnorm_1.1-3 DBI_1.1.2 miniUI_0.1.1.1 xtable_1.8-4
## [41] tmvnsim_1.0-2 bit_4.0.4 stats4_4.1.3 StanHeaders_2.21.0-7 DT_0.22
## [46] htmlwidgets_1.5.4 httr_1.4.2 threejs_0.3.3 arrayhelpers_1.1-0 posterior_1.2.1
## [51] ellipsis_0.3.2 pkgconfig_2.0.3 loo_2.5.1 farver_2.1.0 sass_0.4.1
## [56] dbplyr_2.1.1 utf8_1.2.2 tidyselect_1.1.2 labeling_0.4.2 rlang_1.0.2
## [61] reshape2_1.4.4 later_1.3.0 munsell_0.5.0 cellranger_1.1.0 tools_4.1.3
## [66] cli_3.2.0 generics_0.1.2 broom_0.7.12 ggridges_0.5.3 evaluate_0.15
## [71] fastmap_1.1.0 yaml_2.3.5 processx_3.5.3 knitr_1.38 bit64_4.0.5
## [76] fs_1.5.2 nlme_3.1-155 mime_0.12 xml2_1.3.3 compiler_4.1.3
## [81] bayesplot_1.9.0 shinythemes_1.2.0 rstudioapi_0.13 reprex_2.0.1 bslib_0.3.1
## [86] stringi_1.7.6 highr_0.9 ps_1.6.0 Brobdingnag_1.2-7 lattice_0.20-45
## [91] Matrix_1.4-0 psych_2.2.3 markdown_1.1 shinyjs_2.1.0 tensorA_0.36.2
## [96] vctrs_0.4.0 pillar_1.7.0 lifecycle_1.0.1 jquerylib_0.1.4 bridgesampling_1.1-2
## [101] estimability_1.3 cowplot_1.1.1 httpuv_1.6.5 R6_2.5.1 promises_1.2.0.1
## [106] gridExtra_2.3 codetools_0.2-18 colourpicker_1.1.1 gtools_3.9.2 assertthat_0.2.1
## [111] withr_2.5.0 shinystan_2.6.0 mnormt_2.0.2 parallel_4.1.3 hms_1.1.1
## [116] grid_4.1.3 coda_0.19-4 rmarkdown_2.13 shiny_1.7.1 lubridate_1.8.0
## [121] base64enc_0.1-3 dygraphs_1.1.1.6